Least Absolute Policy Iteration--A Robust Approach to Value Function Approximation



Related resources

Least Absolute Policy Iteration--A Robust Approach to Value Function Approximation

Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers in observed rewards. In this paper, we propose an alternative method that employs the absolute loss for enhancing robustness and reliability. The proposed method is formulated as a linear programming problem which can be solved e...
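The contrast between squared and absolute loss described above can be illustrated with a toy example. This is a minimal sketch, not the paper's algorithm: it fits a single constant to a handful of made-up reward samples, using the facts that the mean minimizes the squared loss and the median minimizes the absolute loss. The function name `fit_constant` and all reward values are hypothetical, chosen only for illustration.

```python
from statistics import mean, median


def fit_constant(rewards, loss):
    """Return the constant c that minimizes the chosen loss over the samples.

    Hypothetical helper for illustration only; it is not part of any library
    and not the method proposed in the paper.
    """
    if loss == "squared":
        # argmin_c sum_i (r_i - c)^2 is the sample mean.
        return mean(rewards)
    if loss == "absolute":
        # argmin_c sum_i |r_i - c| is a sample median.
        return median(rewards)
    raise ValueError(f"unknown loss: {loss!r}")


# Made-up reward samples; the last entry of the second list is an outlier.
clean_rewards = [1.0, 1.1, 0.9, 1.0, 1.05]
noisy_rewards = clean_rewards + [50.0]

ls_clean = fit_constant(clean_rewards, "squared")
ls_noisy = fit_constant(noisy_rewards, "squared")
la_clean = fit_constant(clean_rewards, "absolute")
la_noisy = fit_constant(noisy_rewards, "absolute")

# How far a single outlier moves each estimate.
ls_shift = abs(ls_noisy - ls_clean)
la_shift = abs(la_noisy - la_clean)

print(f"squared-loss estimate shifts by  {ls_shift:.3f}")
print(f"absolute-loss estimate shifts by {la_shift:.3f}")
```

In this sketch, one outlying reward pulls the squared-loss estimate far from the bulk of the samples, while the absolute-loss estimate barely moves. With a full linear value-function approximator, the absolute-loss fit is no longer a simple median, which is where a linear programming formulation like the one mentioned in the abstract comes in.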


A uniform approximation method to solve absolute value equation

In this paper, we propose a parametric uniform approximation method to solve NP-hard absolute value equations. To this end, we uniformly approximate the absolute value so that the nonsmooth absolute value equation can be formulated as a smooth nonlinear equation. By solving the parametric smooth nonlinear equation using Newton's method, for a decreasing sequence of parameters, we can get the ...


Least-Squares Policy Iteration

We propose a new approach to reinforcement learning for control problems which combines value-function approximation with linear architectures and approximate policy iteration. This new approach is motivated by the least-squares temporal-difference learning algorithm (LSTD) for prediction problems, which is known for its efficient use of sample experiences compared to pure temporal-difference a...



Robust Modified Policy Iteration

Robust dynamic programming (robust DP) mitigates the effects of ambiguity in transition probabilities on the solutions of Markov decision problems. We consider the computation of robust DP solutions for discrete-stage, infinite-horizon, discounted problems with finite state and action spaces. We present robust modified policy iteration (RMPI) and demonstrate its convergence. RMPI encompasses bo...



Journal

Journal title: IEICE Transactions on Information and Systems

Year: 2010

ISSN: 0916-8532,1745-1361

DOI: 10.1587/transinf.e93.d.2555